Cancer Medicine Data Project

Johanna Haraldsdottir (s204657), Marie Kragh (s203566), Sofie Bruun (s194678), Amalie Schultz-Nielsen (s204643), and Malou Bech Jakobsen (s203515)

Introduction

  • The Drug data set provides information about various medications, ranging from treatments for cancer and Alzheimer’s disease to remedies for the common cold and pain relief

  • Purpose: Assist healthcare professionals and patients in making decisions when choosing a medication

  • The data set does not satisfy the 3 rules for tidy data:

    • Each variable must have its own column

    • Each observation must have its own row

    • Each value must have its own cell

# A tibble: 11,825 × 2
   `Medicine Name`          Uses                                                
   <chr>                    <chr>                                               
 1 Avastin 400mg Injection  Cancer of colon and rectum Non-small cell lung canc…
 2 Augmentin 625 Duo Tablet Treatment of Bacterial infections                   
 3 Azithral 500 Tablet      Treatment of Bacterial infections                   
 4 Ascoril LS Syrup         Treatment of Cough with mucus                       
 5 Aciloc 150 Tablet        Treatment of Gastroesophageal reflux disease (Acid …
 6 Allegra 120mg Tablet     Treatment of Sneezing and runny nose due to allergi…
 7 Avil 25 Tablet           Treatment of Allergic conditionsTreatment of Respir…
 8 Aricep 5 Tablet          Alzheimer's disease                                 
 9 Amoxyclav 625 Tablet     Treatment of Bacterial infections                   
10 Atarax 25mg Tablet       Treatment of AnxietyTreatment of Skin conditions wi…
# ℹ 11,815 more rows
# A tibble: 11,825 × 3
   `Excellent Review %` `Average Review %` `Poor Review %`
                  <dbl>              <dbl>           <dbl>
 1                   22                 56              22
 2                   47                 35              18
 3                   39                 40              21
 4                   24                 41              35
 5                   34                 37              29
 6                   35                 42              23
 7                   40                 34              26
 8                   43                 28              29
 9                   36                 43              21
10                   35                 41              24
# ℹ 11,815 more rows
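
The two previews above show why the raw data is not tidy: several uses can share one cell, and the review level is spread across three columns. A minimal sketch of one way to reshape such data with tidyverse functions follows. The toy rows, the shortened Uses strings, and the split rule (a new use starts at "Treatment of" directly after a lowercase letter) are assumptions for illustration, not the project's actual cleaning code. The rule would not separate uses that lack the "Treatment of" prefix.

```r
library(tidyverse)

# Toy rows based on the previews above. The Uses strings are shortened,
# illustrative values, since the preview truncates the real ones.
drugs <- tibble(
  `Medicine Name` = c("Augmentin 625 Duo Tablet", "Atarax 25mg Tablet"),
  Uses = c(
    "Treatment of Bacterial infections",
    "Treatment of AnxietyTreatment of Skin conditions"
  ),
  `Excellent Review %` = c(47, 35),
  `Average Review %` = c(35, 41),
  `Poor Review %` = c(18, 24)
)

# One row per medicine and use: insert a separator where "Treatment of"
# follows a lowercase letter (assumed split rule), then split into rows.
uses_long <- drugs %>%
  select(`Medicine Name`, Uses) %>%
  mutate(Uses = str_replace_all(Uses, "([a-z])(Treatment of )", "\\1;\\2")) %>%
  separate_rows(Uses, sep = ";")

# One row per medicine and review level: the three percentage columns
# become a Review_level column and a Review_percent column.
reviews_long <- drugs %>%
  select(`Medicine Name`, ends_with("Review %")) %>%
  pivot_longer(
    cols = ends_with("Review %"),
    names_to = "Review_level",
    values_to = "Review_percent"
  ) %>%
  mutate(Review_level = str_remove(Review_level, " Review %$"))

print(uses_long)
print(reviews_long)
```

The column names Review_level and Review_percent are placeholders chosen to resemble the variables listed later in the deck.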

Materials and Methods

Cleaning

Augmentation

Description of the Cancer Data

  • Variables:
    Medicine_Name
    Manufacturer
    Administration_type
    Application type and number
    API name, amount, and unit
    Side_Effect
    Review_level and Review_% (Poor, Average, and Excellent)
    Counts of Side_Effects, API, and Application
    Classification_Review

  • Distribution of Side Effects

# A tibble: 1 × 4
  minimum maximum  mean median
    <dbl>   <dbl> <dbl>  <dbl>
1       1      23  13.1     10
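
A summary table of this shape could be produced roughly as sketched below, assuming a tidy table with one row per medicine and side effect. The table name side_effects and its toy rows are hypothetical, so the output differs from the project's figures above.

```r
library(tidyverse)

# Hypothetical tidy input: one row per medicine and side effect.
side_effects <- tibble(
  Medicine_Name = c("Drug A", "Drug A", "Drug A", "Drug B", "Drug C", "Drug C"),
  Side_Effect = c("Nausea", "Headache", "Rash", "Nausea", "Fatigue", "Rash")
)

# Count side effects per medicine, then summarise the counts.
side_effect_summary <- side_effects %>%
  count(Medicine_Name, name = "n_side_effects") %>%
  summarise(
    minimum = min(n_side_effects),
    maximum = max(n_side_effects),
    mean = mean(n_side_effects),
    median = median(n_side_effects)
  )

print(side_effect_summary)
```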

The 10 most prevalent drugs

Analysis 1: Review Levels

What are the probable causes behind the excellent, average, and poor reviews for the cancer drugs?

Influence of Number of Side Effects

Influence of Administration Type

Analysis 2: The 7 most prevalent medications

What is the relationship between the side effects and administration type for the 7 most prevalent drugs, compared to the overall administration types for all cancer medicines?

Number of Side Effects for the 7 most prevalent drugs

Number of side effects for a specific administration type

Analysis 3: Manufacturer in relation to review classifications

  • Distribution of the top 6 manufacturers with the most products on the market

  • Classification of review levels

  • For 4 of the top 6 manufacturers, Excellent reviews are most common, followed by Average, with Poor reviews least common

  • Interestingly, Lupin Ltd has equal numbers of Poor and Excellent reviews
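
The slides do not state how each drug was assigned a review classification. One plausible approach, sketched here purely as an assumption, labels each drug with the review level that has the highest percentage and then counts the labels per manufacturer. All names and values in the toy table are hypothetical, and ties are broken by taking the first level listed.

```r
library(tidyverse)

# Hypothetical input: one row per drug with its three review percentages.
reviews <- tibble(
  Medicine_Name = c("Drug A", "Drug B", "Drug C"),
  Manufacturer = c("Maker X", "Maker X", "Maker Y"),
  Excellent = c(47, 22, 35),
  Average = c(35, 56, 41),
  Poor = c(18, 22, 24)
)

# Assumed rule: a drug's classification is its highest-scoring review level.
classified <- reviews %>%
  pivot_longer(
    cols = c(Excellent, Average, Poor),
    names_to = "Review_level",
    values_to = "Review_percent"
  ) %>%
  group_by(Medicine_Name) %>%
  slice_max(Review_percent, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  rename(Classification_Review = Review_level)

# Number of drugs per manufacturer and classification.
classification_counts <- classified %>%
  count(Manufacturer, Classification_Review)

print(classification_counts)
```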

Conclusion

  • There was no apparent relationship between the number of side effects and the review level a drug received

  • The review levels do not seem to depend on the type of administration

  • The distribution of side effects seems random across administration types for the 8 most prevalent drugs, although these drugs have a rather high number of side effects

  • The 8 most prevalent drugs only have the administration types injection and tablet, 75% being injection

  • Administration through capsules is associated with a high number of side effects, whereas administration types such as cream and lotion are associated with very few or no side effects

  • The top 6 manufacturers predominantly have excellent reviews

Discussion

Keep in Mind

  • Representativeness of data

The Next Steps

  • Augmentation: New administration type categories [Topical, Oral, Parenteral]

  • Analysis of the cancer types

  • Analyzing other variables: Influence of API and number of applications
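
The proposed augmentation into Topical, Oral, and Parenteral categories could be sketched as a simple lookup. The example uses only the administration types named in the conclusion; the exact spelling of the values in Administration_type and the new column name Administration_category are assumptions.

```r
library(tidyverse)

# Hypothetical input using the administration types named in the conclusion.
meds <- tibble(
  Administration_type = c("Injection", "Tablet", "Capsule", "Cream", "Lotion")
)

# Map each administration type to a broader route category.
# Types not listed here fall through to NA and would need review.
meds_augmented <- meds %>%
  mutate(
    Administration_category = case_when(
      Administration_type %in% c("Cream", "Lotion") ~ "Topical",
      Administration_type %in% c("Tablet", "Capsule") ~ "Oral",
      Administration_type == "Injection" ~ "Parenteral",
      TRUE ~ NA_character_
    )
  )

print(meds_augmented)
```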
